6 research outputs found

    Manticore: Hardware-Accelerated RTL Simulation with Static Bulk-Synchronous Parallelism

    The demise of Moore's Law and Dennard Scaling has revived interest in specialized computer architectures and accelerators. Verification and testing of this hardware heavily uses cycle-accurate simulation of register-transfer-level (RTL) designs. The best software RTL simulators can simulate designs at 1-1000 kHz, i.e., more than three orders of magnitude slower than hardware. Faster simulation can increase productivity by speeding design iterations and permitting more exhaustive exploration. One possibility is to use parallelism, as RTL exposes considerable fine-grain concurrency. However, state-of-the-art RTL simulators generally perform best when single-threaded since modern processors cannot effectively exploit fine-grain parallelism. This work presents Manticore: a parallel computer designed to accelerate RTL simulation. Manticore uses a static bulk-synchronous parallel (BSP) execution model to eliminate runtime synchronization barriers among many simple processors. Manticore relies entirely on its compiler to schedule resources and communication. Because RTL code is practically free of long divergent execution paths, static scheduling is feasible. Communication and synchronization no longer incur runtime overhead, enabling efficient fine-grain parallelism. Moreover, static scheduling dramatically simplifies the physical implementation, significantly increasing the potential parallelism on a chip. Our 225-core FPGA prototype running at 475 MHz outperforms a state-of-the-art RTL simulator on an Intel Xeon processor running at approximately 3.3 GHz by up to 27.9× (geomean 5.3×) in nine Verilog benchmarks.
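As a rough illustration of the bulk-synchronous idea described in the abstract above, the sketch below runs each simulated clock cycle as one superstep: every partition computes next-register values from the previous cycle's state, and all updates become visible together. This is a toy model under stated assumptions, not Manticore's compiler, instruction set, or scheduling algorithm; the names `bsp_step`, `simulate`, and `PARTITIONS` and the two-register design are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Partition = Dict[str, Callable[[State], int]]


def bsp_step(state: State, partitions: List[Partition]) -> State:
    """Run one simulated clock cycle as a single BSP superstep."""
    # Compute phase: each partition (standing in for one core) reads only
    # the state left by the previous cycle, so the order of partitions
    # does not matter.
    updates: State = {}
    for part in partitions:
        for reg, next_fn in part.items():
            updates[reg] = next_fn(state)
    # Communication phase: all register updates become visible at once.
    new_state = dict(state)
    new_state.update(updates)
    return new_state


def simulate(state: State, partitions: List[Partition], cycles: int) -> State:
    """Advance the toy design by a fixed number of clock cycles."""
    for _ in range(cycles):
        state = bsp_step(state, partitions)
    return state


# Hypothetical toy design: a 4-bit counter and a register that lags the
# counter by one cycle, assigned to two separate partitions.
PARTITIONS: List[Partition] = [
    {"count": lambda s: (s["count"] + 1) & 0xF},
    {"delayed": lambda s: s["count"]},
]

if __name__ == "__main__":
    print(simulate({"count": 0, "delayed": 0}, PARTITIONS, 3))
```

Because the partition assignment is fixed before simulation starts, the loop needs no locks or run-time dependency checks; a real static scheduler would additionally have to place instructions and messages on physical cores, which this sketch does not attempt.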

    Auto-Partitioning Heterogeneous Task-Parallel Programs with StreamBlocks

    FPGAs play an increasing role in the reconfigurable accelerator landscape. A key challenge in designing FPGA-based systems is partitioning computation between processor cores and FPGAs. An appropriate division of labor is difficult to predict in advance and requires experiments and measurements. When an investigation requires rewriting part of the system in a new language or with a new programming model, its high cost can delay design-space exploration. A single-language system with an appropriate programming model and compiler that targets both platforms transforms this tedious exploration to a simple recompile with new compiler directives. This work introduces StreamBlocks, a unified open-source software/FPGA compiler and runtime that takes dataflow programs written in Cal, and automatically partitions them across heterogeneous CPU/FPGA platforms. The explicit task-parallel semantics of dataflow allows our compiler to simultaneously take advantage of thread parallelism on software and spatial parallelism on hardware. StreamBlocks is augmented with a profile-guided autopartitioning tool that helps identify the best hardware-software partitions. We demonstrate the capability of our compiler in finding the right balance between hardware and software execution on both a high-end datacenter accelerator card and an embedded board. Our experiments exhibit a 4-7× speedup over trivial partitions. This speedup is achieved automatically with zero code modifications.

    A CMOS CURRENT-MODE LOW POWER RMS-TO-DC CONVERTER

    In this paper, a low-power current-mode RMS-to-DC converter is proposed. The converter includes a two-quadrant squarer/divider and a first-order low-pass filter cell, both of which use MOS translinear loops. The RMS-to-DC converter has low power consumption (< 0.75 μW), low supply voltage (0.8 V), wide input range (from 40 nA to 500 nA), low relative error (< 3 %), and low circuit complexity. Comparing the proposed circuit with two other current-mode circuits shows that the former outperforms the latter in terms of power dissipation, supply voltage, and complexity. Simulation results from HSPICE show the high performance of the circuit and confirm the validity of the proposed design technique.
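The abstract above names two building blocks, a squarer/divider and a first-order low-pass filter. A minimal numerical sketch of how such blocks can yield an RMS value is given below, assuming the common implicit arrangement in which the output y is fed back as the divisor, so that y tracks the low-pass-filtered value of x²/y and settles near sqrt(mean(x²)). The abstract does not state how the blocks are connected in the proposed circuit, so this topology, the function names `implicit_rms` and `sine_wave`, and all parameter values are assumptions; the code models ideal arithmetic, not MOS translinear behavior.

```python
import math


def implicit_rms(samples, dt, tau, y0=1.0):
    """Estimate the RMS of `samples` with a squarer/divider in a low-pass loop."""
    y = y0  # must start positive so the division below is defined
    alpha = dt / tau  # forward-Euler step of the first-order low-pass filter
    for x in samples:
        squared_over_y = (x * x) / y       # ideal squarer/divider stage
        y += alpha * (squared_over_y - y)  # first-order low-pass stage
    return y


def sine_wave(amplitude, freq_hz, dt, duration):
    """Generate a sampled sine wave (units are arbitrary, e.g. nA)."""
    n = int(round(duration / dt))
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * k * dt) for k in range(n)]


if __name__ == "__main__":
    dt, tau = 1e-6, 0.02
    samples = sine_wave(300.0, 1000.0, dt, 0.2)
    estimate = implicit_rms(samples, dt, tau)
    print(estimate, 300.0 / math.sqrt(2.0))
```

The filter time constant `tau` trades settling time against ripple: a longer `tau` averages over more signal periods but takes longer to converge.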

    Wavelet-enhanced convolutional neural network: a new idea in a deep learning paradigm

    Purpose: Manual brain tumor segmentation is a challenging task that requires the use of machine learning techniques. One of the machine learning techniques that has been given much attention is the convolutional neural network (CNN). The performance of the CNN can be enhanced by combining it with other data analysis tools such as the wavelet transform. Materials and methods: In this study, one of the well-known implementations of CNN, a fully convolutional network (FCN), was used in brain tumor segmentation and its architecture was enhanced by the wavelet transform. In this combination, the wavelet transform was used as a complementary and enhancing tool for the CNN in brain tumor segmentation. Results: Comparing the performance of the basic FCN architecture against the wavelet-enhanced form revealed a remarkable superiority of the enhanced architecture in brain tumor segmentation tasks. Conclusion: Using enhancing tools such as the wavelet transform and other mathematical functions can improve the performance of CNN in any image processing task such as segmentation and classification.

    Polypyrrole/multiwall carbon nanotube nanocomposites electropolymerized on copper substrate

    Polypyrrole/multiwall carbon nanotube (PPy/MWCNT) nanocomposites were successfully synthesized by electropolymerization of MWCNT-dispersed pyrrole solution on the surface of copper electrodes. The obtained nanocomposites were characterized with scanning electron microscopy (SEM), linear sweep voltammetry (LSV) and thermal gravimetric analysis (TGA). Polypyrrole structures which embraced the MWCNTs led to the formation of nanocomposite striated parallel walls. MWCNTs acted as appropriate substrates for electrodeposition of polypyrrole particulate structures and high yield synthesis of PPy was observed on them. Smooth PPy/MWCNT nanocomposite films were obtained on Cu electrodes by decreasing the potential scan rate. Thermogravimetric analysis showed that MWCNTs increased the thermal stability of polypyrrole.